61 research outputs found
OpenDMAP: An open source, ontology-driven concept analysis engine, with applications to capturing knowledge regarding protein transport, protein interactions and cell-type-specific gene expression
<p>Abstract</p> <p>Background</p> <p>Information extraction (IE) efforts are widely acknowledged to be important in harnessing the rapid advance of biomedical knowledge, particularly in areas where important factual information is published in a diverse literature. Here we report on the design, implementation and several evaluations of OpenDMAP, an ontology-driven, integrated concept analysis system. It significantly advances the state of the art in information extraction by leveraging knowledge in ontological resources, integrating diverse text processing applications, and using an expanded pattern language that allows the mixing of syntactic and semantic elements and variable ordering.</p> <p>Results</p> <p>OpenDMAP information extraction systems were produced for extracting protein transport assertions (transport), protein-protein interaction assertions (interaction) and assertions that a gene is expressed in a cell type (expression). Evaluations were performed on each system, resulting in F-scores ranging from .26 – .72 (precision .39 – .85, recall .16 – .85). Additionally, each of these systems was run over all abstracts in MEDLINE, producing a total of 72,460 transport instances, 265,795 interaction instances and 176,153 expression instances. </p> <p>Conclusion</p> <p>OpenDMAP advances the performance standards for extracting protein-protein interaction predications from the full texts of biomedical research articles. Furthermore, this level of performance appears to generalize to other information extraction tasks, including extracting information about predicates of more than two arguments. The output of the information extraction system is always constructed from elements of an ontology, ensuring that the knowledge representation is grounded with respect to a carefully constructed model of reality. The results of these efforts can be used to increase the efficiency of manual curation efforts and to provide additional features in systems that integrate multiple sources for information extraction. The open source OpenDMAP code library is freely available at <url>http://bionlp.sourceforge.net/</url></p
Iron Behaving Badly: Inappropriate Iron Chelation as a Major Contributor to the Aetiology of Vascular and Other Progressive Inflammatory and Degenerative Diseases
The production of peroxide and superoxide is an inevitable consequence of
aerobic metabolism, and while these particular "reactive oxygen species" (ROSs)
can exhibit a number of biological effects, they are not of themselves
excessively reactive and thus they are not especially damaging at physiological
concentrations. However, their reactions with poorly liganded iron species can
lead to the catalytic production of the very reactive and dangerous hydroxyl
radical, which is exceptionally damaging, and a major cause of chronic
inflammation. We review the considerable and wide-ranging evidence for the
involvement of this combination of (su)peroxide and poorly liganded iron in a
large number of physiological and indeed pathological processes and
inflammatory disorders, especially those involving the progressive degradation
of cellular and organismal performance. These diseases share a great many
similarities and thus might be considered to have a common cause (i.e.
iron-catalysed free radical and especially hydroxyl radical generation). The
studies reviewed include those focused on a series of cardiovascular, metabolic
and neurological diseases, where iron can be found at the sites of plaques and
lesions, as well as studies showing the significance of iron to aging and
longevity. The effective chelation of iron by natural or synthetic ligands is
thus of major physiological (and potentially therapeutic) importance. As
systems properties, we need to recognise that physiological observables have
multiple molecular causes, and studying them in isolation leads to inconsistent
patterns of apparent causality when it is the simultaneous combination of
multiple factors that is responsible. This explains, for instance, the
decidedly mixed effects of antioxidants that have been observed, etc...Comment: 159 pages, including 9 Figs and 2184 reference
An annotated corpus with nanomedicine and pharmacokinetic parameters
Nastassja A Lewinski,1 Ivan Jimenez,1 Bridget T McInnes2 1Department of Chemical and Life Science Engineering, Virginia Commonwealth University, Richmond, VA, 2Department of Computer Science, Virginia Commonwealth University, Richmond, VA, USA Abstract: A vast amount of data on nanomedicines is being generated and published, and natural language processing (NLP) approaches can automate the extraction of unstructured text-based data. Annotated corpora are a key resource for NLP and information extraction methods which employ machine learning. Although corpora are available for pharmaceuticals, resources for nanomedicines and nanotechnology are still limited. To foster nanotechnology text mining (NanoNLP) efforts, we have constructed a corpus of annotated drug product inserts taken from the US Food and Drug Administration’s Drugs@FDA online database. In this work, we present the development of the Engineered Nanomedicine Database corpus to support the evaluation of nanomedicine entity extraction. The data were manually annotated for 21 entity mentions consisting of nanomedicine physicochemical characterization, exposure, and biologic response information of 41 Food and Drug Administration-approved nanomedicines. We evaluate the reliability of the manual annotations and demonstrate the use of the corpus by evaluating two state-of-the-art named entity extraction systems, OpenNLP and Stanford NER. The annotated corpus is available open source and, based on these results, guidelines and suggestions for future development of additional nanomedicine corpora are provided. Keywords: nanotechnology, informatics, natural language processing, text mining, corpor
- …